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A ROAD  MAP  OF  METHODS  FOR  APPROXIMATING  SOLUTIONS 
OF  TWO-POINT  BOUNDARY-VALUE  PROBLEMS 

by 

James  W.  Daniel* 

Abstract 

Numerical  methods  for  approximating  solutions  of  two-point 
boundary-value  problems  for  ordinary  differential  equations  are 
surveyed.  Specific  complete  algorithms  are  classified  according  to 
how  the  original  problem  is  transformed,  how  the  transformed  problem 
is  modeled  discretely,  and  how  the  discrete  model  is  solved.  Relation- 
ships among  various  complete  algorithms  are  presented.  Convergence 
acceleration,  error  estimation  and  control,  and  parameter  selection 
are  also  discussed. 

Key  Words:  Boundary-value,  ordinary  differential  equations, 
numerical  methods. 

1 . INTRODUCTION 

I intentionally  avoided  calling  this  paper  a "survey"  because,  having 
once  worked  as  a surveyor,  I know  that  a survey  of  a city  gives  an  extremely 
detailed  description  of  the  precise  layout  of  the  property  in  that  city 

i 
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and  is  not  very  helpful  to  someone  trying  to  find  his  or  her  way  around 
town.  Analogously,  presenting  all  the  details  of  various  Implemented 
methods  for  solving  boundary- value  problems  can  obscure  the  concepts. 

It  is  also  true,  however,  that  a coarse  aerial  photograph  of  a city  is 
a poor  guide  for  the  lost  traveler,  and,  analogously,  a very  abstract 
model  representing  all  methods  for  boundary-value  problems  is  too  general 
to  impart  much  information.  What  both  the  traveler  and  the  student  of 
numerical  methods  need  is  a useful  roadmap  with  not  only  enough  detail 
to  show  the  various  points  of  interest  but  also  enough  perspective  to 
show  where  these  sites  lie  in  relation  to  one  another.  Here  I present 
my  own  such  roadmap  of  what  some  numerical  methods  for  boundary-value 
problems  are,  of  how  they  relate  to  one  another,  and  of  what  areas  need 
development  in  order  to  improve  methods.  (My  use  of  "I"  as  the  first 
word  of  this  paper  is  also  intentional:  the  reader  is  to  be  warned  that 
this  is  the  personal  view  of  one  individual.) 

Now,  what  kinds  of  boundary-value  problems  are  we  to  consider?  In 
the  spirit  of  the  first  paragraph  1 want  to  present  neither  a single 
abstract  problem  including  all  cases  nor  a vast  list  of  specific  special 
problems.  I will  discuss  instead  a couple  of  model  problems  for  which 
the  solution  methods  will  share  many  features  with  methods  for  the 
panorama  of  distinct  problem  types:  eigenvalues,  non-linear  boundary 
conditions,  m-th  order  equations,  systems  of  equations  for  vector-valued 
functions,  mixed-order  systems,  infinite-intervals  for  the  Independent 
variable,  singular  problems,  singular-perturbation  problems,  multi-point 
boundary  conditions,  et  cetera.  I will  consider  both  the  first-order 
system  for  the  nxl  vector  valued  function  y: 
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(1.1)  y' (t)  - f (t,^(t))  forO<t<l 

and  the  second-order  scalar  equation; 

(1.2)  y"(t)  - f(t,y(t),y’ (t))  for  0 < t < 1 

since  numerical  methods  on  the  first-order  system  equivalent  to  (1.2) 
usually  are  dramatically  less  efficient  than  methods  directly  Intended 
for  second-order  problems;  note  that  I restrict  myself  to  a finite  range 
for  the  independent  variable  and  I use  0<  t<  1 as  a canonical  interval. 
Boundary  conditions  for  (1.1)  are  given  by  n nonlinear  equations 
involving  y(0)  and  y(l): 

(1.3)  b(y(0)  , y(l))  - 0 

■ « m • 

in  vector  notation.  For  (1.2)  we  give  two  nonlinear  equations  relating 
y(0),  y'(0),  y(l),  and  y'(l),  which  we  can  express  in  vector  notation: 

(1.4)  s(y(o).  y'(o).  y'd))  - 2 • 

In  many  cases  the  boundary  conditions  will  in  fact  be  linear,  in  which 
case  we  replace  (1.3)  with 

(1.5)  Bq2(0)  + “ S 

where  ^ and  B,  are  nxn  and  e is  nxl,  while  we  replace  (1.4)  with 

■U  •!  « 


(1.6) 


CQy(O)  + (jQy'  (0)  + Sj^yd)  + dj^y'  (D  - 2 
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where  and  a are  all  2x1;  these  special  forms  can 

h-  useful  computationally.  Another  common  and  computationally  advan- 
tageous situation  Is  that  In  which  the  boundary  conditions  are  separated, 
so  that  conditions  at  t-0  and  at  t»l  do  not  Interact.  In  this 
case  we  can  write  (1.3)  and  (1.5)  as 

" So*  gl2(l>  " £i 

where  Is  qxn,  Is  (n-q)xn,  e^  Is  qxl,  and  e^^  Is  (n-q)xl, 

for  some  Integer  q with  1 < q < n.  Similarly  (1.4)  and  (1.6)  become  In 
the  separated  case 


(1.8) 


Coy(O)  + dQy'(O)  - Sq,  Cj^y(l)  + d^y' (1)  - aj^. 


Thus  we  will  be  considering  either  (1.1)  with  one  of  the  boundary  conditions 
(1.3),  (1.5),  (1.7)  or  (1.2)  with  one  of  the  boundary  conditions  (1.4), 
(1.6),  (1.8).  In  the  Interest  of  time  and  space  we  often  will  discuss 
a method  as  applied  to  either  the  first-order  or  the  second-order  problem 
when  the  analogous  use  of  the  Idea  of  the  method  for  the  other  standard 
problem  Is  fairly  straightforward. 


The  next  task  and  the  main  task  of  this  paper  Is  to  describe  how  to 
classify  various  methods.  The  "aerial  photograph"  approach  would  be  to 


note  that  the  problem  Is  simply  to  solve  F(jj)  “ 0 for  x In  some 
appropriate  abstract  space  and  F some  nonlinear  operator,  while  numerical 


methods  eventually  solve  some  discretization  for  lo 

some  discretized  (flnlte-dlmenslonal)  space;  this  doesn't  really  tell  us 
much  about  the  structure  of  various  specific  methods.  The  "survey" 


approach  would  be  to  describe  computer  codes  Implementing  various  specific 
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methods;  this  gives  us  more  detail  than  we  can  absorb.  Instead  I will 
give  a "road  map"  approach  which  defines  a complete  method  as  having 
three  aspects: 

1)  a Transformed  Problem. 

2)  a Discrete  Model  of  the  Transformed  Problem,  and 

3)  a Solution  Technique  for  the  Discrete  Model. 

In  Section  2 of  this  paper  I describe  various  Transformed  Problems 
equivalent  to  (1.1)  or  (1.2)  and  their  boundary  conditions.  Section  3 
presents  approaches  for  creating  Discrete  Models,  while  Section  4 outlines 
some  Solution  Techniques.  In  Section  5 complete  methods  generated 
by  different  choices  of  1),  2),  and  3)  above  are  compared  and  some  are 
shown  to  be  equivalent  to  others.  Sections  6 and  7 briefly  discuss  the 
Important  notions  of  accelerating  convergence  and  estimating  errors,  and 
of  controlling  errors  by  selecting  the  parameters  upon  which  various 
complete  methods  depend.  A brief  conclusion  appears  in  Section  8. 

Section  9 contains  a brief  collection  of  references.  Since  I am  not 
attempting  here  to  give  a historical  or  developmental  view  of  methods, 

I will  not  present  a detailed  bibliography  but  rather  will  indicate 
selected  references  where  more  detail  and  more  references  may  be  found. 

Good  general  sources  of  references  are  [Keller  (1968,  1975),  Aktas- 
Stetter  (1977)  ] . 

I thank  the  many  colleagues,  especially  Victor  Pereyra  and  Andy  Vfhlte, 
who  over  the  years  have  influenced  my  view  of  numerical  methods  for 
boundary-value  problems.  Although  I would  gladly  blame  mistakes  on  my 
colleagues  and  take  credit  for  any  Insights,  unfortunately  I must  accept 
responsibility  for  all  the  views  expressed  in  this  paper. 
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2 . TRANSFORMED  PROBLEMS 

The  basic  notion  of  this  paper  is  that  a conplete  method  can  be 
viewed  as  the  straightforward  application  of  a fairly  standard  discreti- 
zation process  to  a Transformed  Problem  that  is  equivalent  to  the  original 
boundary-value  problem.  From  this  perspective  we  will  describe  the 
classical  shooting  method,  for  example,  as  the  application  of  standard 
numerical  methods  for  initial-value  problems  and  for  nonlinear  equations 
to  the  Transformed  Problem  of  finding  the  proper  initial  values  so  as  to 
satisfy  the  boundary  conditions.  In  this  Section  we  examine  several 
Transformed  Problems. 

2.1  No  transformation.  For  completeness  we  mention  the  trivial  case  of 
applying  no  transformation,  so  that  the  Transformed  Problem  and  the 
original  problem  coincide.  This  allows  us  to  discuss  complete  methods 
which  appear  to  be  frontal  assaults  on  the  boundary-value  problem. 

2.2  Variational  problems.  Many  boundary-value  problems  arise  in  the 
physical  sciences  as  the  variational  or  Euler-Lagrange  equations  for 
problems  in  the  calculus  of  variations  [Courant-Hilbert  (1953)].  For 
example,  consider  Che  problem  of  minimizing 

(2.1)  J[y]  - i (y'(t))2  + F(t,y(t))}dt 

Jq 

for  all  sufficiently  smooth  functions  y satisfying  the  linear  and 
separated  boundary  conditions 


(2.2)  y(0)  ■ a,  y(l)  - b . 

Then  Che  theory  of  the  calculus  of  variations  shows  that 


I 
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(2.3) 


y"(t)  - f(t,y(t))  for  0<t<l,  y(0)  = a,  y(l)  » b 


where  f(t,y)  « ^ F(t,y).  Thus  the  calculus  of  variations  problem  of 

minimizing  J in  (2.1)  reduces  to  a special  form  of  the  boundary-value 

problem  (1.2)  in  which  no  y'  term  appears.  Conversely,  certain 

9f 

boundary-value  problems  (2.3) — for  example,  those  in  which  ■^^0  for 

dy 

all  t and  y — are  equivalent  to  minimizing  J[y]  and  are  perhaps  more 
naturally  described  as  such  variational  problems.  In  this  case,  we 
change  the  original  boundary-value  problem  to  the  Transformed  Problem: 
find  a function  y minimizing  J[y]  in  (2.1)  subject  to  conditions  (2.2) 


2.3  Shooting  and  its  variants.  We  consider  first  here  the  simplest  form 
of  shooting  applied  to  the  first-order  system  (1.1)  with  boundary  condi- 
tions (1.3),  that  is 

y'  * 0<t<l,  b(y(0),y(l))  = 0 . 


We  choose  an  initial-value  vector  z that  is  n x 1 and  let  y(t;z)  solve 

B B ^ 

(1.1)  subject  to  the  initial  condition 
y(0;z)  - z . 


The  original  problem  now  can  be  restated  as  the  Transformed  Problem:  find 
an  n X 1 vector  g so  that  ]2(i>  y(l;2))  **  P*  We  have  replaced  a boundary- 
value  problem  for  differential  equations  with  a single  system  of  nonlinear 
equations  (and  an  intermediate  problem  of  finding  ^(l;z)).  It  is  easy  to 
see  one  of  the  potential  difficulties  in  using  simple  shooting  (Independent 
of  the  numerical  technique  used  to  compute  ^(l;j))  by  examining  a variant 
of  shooting  called  superposition  [Scott-Watts  (1977)]. 
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Suppose  that  our  first-order  system  (1.1)  is  linear  with  linear 
boundary  conditions  (1.5)  so  that 

(2.4)  y'(t)  = A(t)y(t)  + g(t)  for  0<t<l,  BQy(O)  + S^y(l)  “ g 

where  A is  nxn.  We  let  Y(t)  be  the  nxn  fundamental  solution  matrix 
satisfying 

Y'(t)  = A(t)Y(t)  for  0<t<l,  Y(0)  - I 

and  let  the  n x 1 vector  p be  any  particular  solution  to 
g’ (t)  = A(t)p(t)  + g(t)  for  0<t<l  . 

Then  every  solution  ^ to  (2.4)  is  of  the  form 

(2.5)  y(t)  = p(t)  + Y(t)x  for  0<t<l 

for  some  nxl  vector  x independent  of  t;  this  representation  merely 
writes  y as  p plus  a linear  combination  of  a linearly  independent 
set  of  solutions  to  the  homogeneous  equations.  The  boundary  conditions 
now  merely  become  the  linear  algebraic  equations  for  x: 


To  determine  it  ia  essential  of  course  that  2o'*’Sl«^^^  non- 
singular;  the  difficulty  can  be  that  the  numerical  computation  of  Y 

k causes  the  crucial  matrix  to  become  singular.  For  example,  while 

♦ 
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is  non-singular  for  every  t,  because  of 


rounding  errors  a computer  representation  of  this  matrix  will  become 


the  singular  matrix 


for  t much  larger  than  unity. 


Thus  simple  superposition  can  have  difficulties;  since  from  (2.5)  we  see 
that  y(0)  = p(0)+x,  the  coefficients  x are  nearly  the  initial  values 

« as  ® ® 

z of  shooting,  and  indeed  x = z if  p(0)  =0,  so  we  see  that  the  same 

* * * as  » 

potential  difficulty  is  inherent  in  shooting. 

From  the  shooting  viewpoint,  a way  out  of  this  problem  is  multiple 

shooting  [Keller  (1968]  in  which  we  we  simultaneously  shoot  from  k 

distinct  t values  0 = 'I^<  • * • < < 1.  That  is,  for  arbitrary  nxl 

vectors  z.,*-*,z,  we  let  y4(t;z.)  for  l^i£k  solve  y ! (t)  - f (t  ,y  (t) ) 

for  T^  < t < T^^^  (with  = 1)  subject  to  the  initial  conditions 

»i^^l’»i^  ”»i’  original  problem  now  can  be  restated  as  the  Transformed 

Problem:  find  k nxl  vectors  z,,‘’',z,  so  that  b(z  ,y  (l;z,  ))  = 0, 

•=i  =K  = “i  sic 

^^(T^^j^;z^)  “ l5l$k-l,  a system  of  kn  equations  for  the  kn 

unknown  components  of  hope  is,  of  course,  that  the  z^ 

and  Tj^  can  be  chosen  so  shrewdly  that  the  numerically  singular  matrices 
mentioned  in  the  preceding  paragraph  do  not  arise. 

From  the  superposition  viewpoint  we  can  use  multiple  starting  points 
Tj  ■ 0 < T2<  • • • < Tj^  < 1 as  well.  We  merely  represent  ^ for  T^  5 t 1 
by  ^(t)  “ where  p^^  is  a particular  solution  of  the 

inhomogeneous  equation  on  ^ ^ ^ ® fundamental  solution 

there  solving  with  q for  some  given  non-singular 
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matrix  q.  Recalling  that  we  need  to  satisfy  the  linear  boundary  conditions 
^ -1^^^^  “I  continuity  conditions  for  y(Tj^)  which  become 


(2.7) 


we  see  that  we  again  have  kn  (linear)  equations  for  the  kn  unknown 


components  of  Since  the  unknowns  In  (2.7) 


we  see  that  the  system  of  linear  equations  we  must  solve  Is  essentially 
block  bi-dlagonal  with  0 appearing  as  the  blocks. 


Just  as  for  simple  shooting  and  simple  superposition,  if  we  choose  Y n“I 

“1 , U “ 


and  Pj (T . ) = 0 then  the  coordinates  Xj  in  multiple  superposition  equal 

» i 1 * “1 


the  Initial  values  in  multiple  shooting.  Other  choices  of  Y^  ^ and 


PlCTf)  are  possible  however.  In  the  so-called  re-orthogonallzatlon  method 
[Scott-Watts  (1977)],  q * I and  each  subsequent  Y^  ^ is  chosen  as  the 


Gram-Schmidt  orthogonallzed  version  of  Thus  we  decompose 


Yi_i(Ti)  as 


^l-l^Ti>“2iSi 


for  orthogonal  and  upper-right  triangular  and  then  let  ^^j^(Tj^) 


Y^  O^^i'  this  case  the  equations  (2.7)  for  the  coefficients 


Xj^,***,Xj^  become 


2l_l(Ti) +2iRiXi_i  - 0 + 2iXi  . 


and  multiplying  by  the  inverse  of  the  orthogonal  matrix  gives 


(2.8) 


2i  £l-l^^i^  * 5i2l-l  “^i 


“T  ' 
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Thus  in  the  case  of  the  re-orthogonalizat ion  method  the  nearly  block 
bi-diagonal  matrix  describing  the  equations  for  25]^*’ ’‘*^1^ 

very  simple  blocks  I and  R. , making  solution  of  the  system  quite  simple. 

Whatever  variants  we  take  of  simple  shooting,  they  all  reduce  the 
original  boundary-value  problem  to  a Transformed  Problem  of  determining 
a finite  set  of  numbers — z or  x or  or  Xj^,''',Xj^. 


2.4  Quasi-linearization.  Mathematicians  are  well  known  as  people  who,  when 
they  cannot  solve  a certain  difficult  problem.  Instead  solve  some  easy  pro- 
blem in  the  hope  that  this  will  somehow  be  profitable.  Quasi-linearization 
[Bellman-Kalaba  (1965)1,  often  known  as  Newton's  method,  is  the  application 
of  this  ploy  to  create  Transformed  Problems  simpler  than  an  original  diffi- 
cult nonlinear  boundary-value  problem.  The  idea  is  that,  given  one  approximate 
solution  to  the  first-order  system  (1.1),  with  general  boundary 

conditions  (1.3),  we  approximate  the  differential  equation  (1.1)  for  y 

a 

near  y^  by  the  linear  (in  differential  equation 


3f 


(2.9) 


yl+l^'')  + JZ  (t,y^(t))(y^^j(t)-y^(t)), 


3f 


where  is  the  nxn  Jacobian  matrix  of  f with  respect  to  Similarly 

we  approximate  the  boundary  conditions  (1.3)  by  the  linear  (in  boundary 


conditions 


(2.10) 


b(y,(0),j;,(l))+BQ^^(y^^^(0)-y^(0))+B^^^(y,^,(l)-y,(l))-2 

-0,i  “ 3y(0) 

3b 

5i.i“ 


The  equations  (2.9),  (2.10)  comprise  a linear  boundary-value  problem  for 
y^^j^  of  the  same  form  as  (2.4).  As  usual  with  Newton's  method,  the  hope 
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is  that  the  sequence  of  functions  yQ.y^^,...  converges  to  y solving 
(1.1),  (1.3);  under  reasonable  conditions  on  f and  b this  does  Indeed 
occur  if  the  initial  guess  yr.(t)  is  sufficiently  close  to  y(t).  Thus 
we  have  reduced  the  original  nonlinear  boundary-value  problem  to  the 
Transformed  Problem:  solve  a sequence  of  linear  boundary-value  problems 
for  yo’yi’---  converging  to 

2 

At  this  point  notice  that  we  can  easily  speak  of  Transformed  Problems 
whenever  the  Transformed  Problem  is  transformed  again;  this  occurs  for 
example  if  we  use  shooting  or  superposition  as  in  Subsection  2.3  to  solve 
each  of  the  linear  boundary-value  problems  (2.10).  Some  methods  designed 
only  for  linear  problems  can  thus  be  used  with  quasilinearization  on 
nonlinear  problems  I for  example,  see  Scott  (1975),  Scott-Watts  (1977)]. 

2.5  Continuation  and  embedding.  In  many  real  problems  the  differential 

equation  (and  perhaps  the  boundary  conditions)  depends  on  certain  physical 

parameters,  and  solutions  are  desired  over  a range  of  values  of  these 

parameters.  If  this  is  not  the  case,  it  is  usually  possible  to  think  of 

the  problem  as  being  for  one  specific  value,  say  X_,  of  some  physically 

meaningful  parameter  A which  we  can  imagine  being  allowed  to  vary.  In 

rare  instances  an  artificial  parameter  may  need  to  be  introduced  by,  for 

example,  considering  y'  =f(t,y)  as  the  problem  when  A = A„  “ 1 is  sub- 

w * » F 

stituted  in  the  equation  y*  "Af(t,y).  In  any  case  we  suppose  that  we  have  a 
family  of  differential  equations 

(2.11)  y'-f(t,y;A) 

m m 

whose  solution  y(t;A)  is  especially  desired  for  A «■  A and  perhaps  for 

m F 

many  other  values  of  A as  well. 


In  the  continuation  method  we  assume  that  (2.11)  can  be  solved 


easily  for  some  value  Aq  of  the  parameter.  Assuming  Xq < Ap  for 
convenience,  we  then  set  out  to  solve  (2.11)  for  a sequence  of  k values 
of  A,  say  " ^0^^2  ^1+1  "near"  A^,  the  hope  Is 

that  (2.11)  for  ^ solved  fairly  easily  by  making  use  of 

the  already  obtained  solution  y(t;A  ) at  A=A..  Thus  we  have  replaced 
the  original  boundary-value  problem  (1.1)  by  the  Transformed  Problem: 
solve  (2.11)  for  a sequence  Aq, A^, ’ * • , Aj^  of  values  of  the  parameter  A. 

This  is  a useful  device  whenever  one  can  use  y to  advantage  in  obtaining 
since  any  numerical  method  will  require  solving  some  nonlinear 
equations  for  the  approximation  to  the  approximation  obtained  for 

y can  usually  be  used  as  the  first  iterate  in  an  iterative  method  for 

m X 

solving  these  nonlinear  equations  for  ^or  example,  quasi-linearization 

might  be  used  on  (2.11)  at  ^ with  the  solution  at  A = A^  as  the 

starting  iterate. 

Another  approach  to  solving  (2.11)  for  ^ “ Xp  is  to  use  the  embedding 
method  [Scott  (1973,1975)]  which  derives  differential  equations  for  the 
dependence  of  ^(t;A)  on  A.  While  this  usually  results  in  nonlinear 
partial  differential  equations  for  ^ as  a function  of  t and  A,  the 
side  conditions  often  can  be  chosen  to  be  initial  conditions  since  we 
assumed  the  problem  to  be  easily  solved  initially  at  A Aq.  Thus  the 
original  ordinary  differential  equation  boundary-value  problem  is  replaced 
by  the  Transformed  Froblem:  solve  an  initial-value  problem  for  a partial 
differential  equation  involving  ^ as  a function  of  t and  A. 

A wide  variety  of  these  embedding  methods  have  been  used  depending 
on  precisely  how  A enters  the  differential  equation.  A very  common 
practice  is  to  use  the  interval  length  over  which  t varies  as  the 


embedding  parameter;  it  is  for  this  case  that  I will  restrict  the  use  of 


the  broad  term  invariant  embedding  [Scott  (1973,1975)1.  Thus  we  think 
of  the  family  of  problems  defined  for  0<t<X  and  we  let  A range  from 
zero  to  unity.  For  (1.1)  subject  to  (1.3),  for  example,  we  Instead  consider 

“ f(t,y)  for  0<t<A  with  b(y(0),y(A))“0. 

When  A»Aq=0  this  reduces  to  the  system  of  n equations  b (y (0)  ,y (0) ) “ 0 
for  the  n unknown  components  of  ^(0;Aq)  which  is  assumed  to  be  our 
"easy"  problem.  When  the  original  differential  equation  (or  boundary  con- 
dition) is  nonlinear.  Invariant  embedding  yields  a nonlinear  partial 
differential  equation.  To  avoid  this  difficulty  the  computationally  most 
successful  approach  appears  [Scott  (1973,  1975)]  to  be  to  develop  invariant 
embedding  for  linear  problems  (which  turn  out  to  lead  to  ordinary  differen- 
tial equations  when  invariant  embedding  is  used)  and  then  to  use  quasi- 
linearization  as  a device  for  replacing  nonlinear  problems  by  a sequence 
of  linear  problems,  each  of  which  is  transformed  and  solved  by  Invariant 
embedding;  of  course  this  can  be  viewed  as  just  one  possible  device  for 
solving  the  nonlinear  partial  differential  equation.  For  the  rest  of  this 
section  we  restrict  ourselves  to  a consideration  of  the  linear  second-order 
equation 

(2.12)  y"(t) +p(t)y’ (t) +q(t)y(t)  - g(t)  for  0 < t < A 

subject  to  separated  linear  boundary  conditions 

(2.13)  CQy(O) +dQy' (0)  -a^,  Cjy(A)  + d^y ' (A)  - a^ 

as  in  (1.8);  the  solution  to  (2.12),  (2.13)  is  y(t;A)  and  it  is  desired 
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for  0-X-l.  General  linear  boundary  conditions  can  b e handled  as  well. 

The  simplest  invariant  embedding  method  for  (2.12),  (2.13)  is  the 
sweep  or  factorization  method  in  which  we  introduce  two  auxiliary  functions 
a(t)  and  g(t)  for  OitSl  and  set  y'  “ay  + S;  It  turns  out  that  a 
and  g must  satisfy 


(2.14) 


a' (t)  = -q(t)  - p(t)a(t)  - a^(t)  forO<t<l,  a(0) 
g' (t)  = g(t)  - (p(t) +a(t))g(t)  forO<t<l,  e(0) 


(If  dQ»0,  a slightly  different  method  is  used.)  Having  computed  a and 
g from  (2.14),  applying  y'  “ay + 6 at  t = X along  with  the  boundary 
condition  c^^y (X)  + d^y ' (X)  = gives  us  two  linear  equations  which  we 
solve  for  the  unknowns  y(X;X)  and  y'(X;X)-  Having  found  y(X;X)  and 
a(t),g(t)  for  O^t^X-1  we  finally  solve 

(2.15)  y'(t;X)  = ot(t)y(t;X)  + 3(t)  for  0<t<X 

in  the  backward  direction  starting  from  the  recently  found  value  y(X;X) 
for  y(t;X)  at  t • X-  This  gives  the  desired  solution  y(t;X)  for 
OSt^X.  Note  that  the  initial-value  problems  for  a and  g need  only 
be  solved  once.  Thereafter,  to  find  y(t;X)  for  any  X only  requires  the 
solution  of  the  two  linear  algebraic  equations  for  y(X;X)  and  then 
Integration  of  one  backwards  initial-value  problem  (2.15)  for  y. 

Experience  has  Indicated  [Scott  (1975)]  that  a somewhat  more  complex 
Invariant  embedding  method  is  better  than  the  sweep  method  above.  In 
this  version  four  auxiliary  functions  r^^,  r2,  Sj^,  and  82  are  Introduced 


in  such  a fashion  that 
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kt 


a 


(2.16) 


y(t;X)  ='rj^(t)y'(t;X)+r2(t) 
y' (0;X)  “ Sj^(t)y’ (t;X)  + 32(t)  . 


From  (2.16)  we  see  that  we  know  r^,  r2,  s^,  and  S2  for  05t-l 
and  If  we  know  y* (0;A)  then  (2.16)  gives 


(2.17) 


y(t;X)  = rj^(t) 


y' (0;X)  - S2(t) 

Sj^(t) 


+ r2(t) 


which  expresses  y(t;X)  in  terms  of  known  quantities.  To  determine 

rf,  r2,  8^,  S2,  and  y'(0;X)  we  proceed  as  follows.  We  find  r^,  r2,  Sj^,  S2 

for  05  til  by  solving  the  initial-value  problems 


(2.18) 


r^(t)  = l+p(t)rj^(t) +q(t)r^(t),  rj^(0)-0, 

r2(t)  = q(t)r^(t)r2(t) , r2(0)=l, 

s|^(t)  = (p(t) +q(t)rj^(t))s^(t) , Sj^(0)»l, 

S2(t)  = (-g(t) +q(t)r2(t))Sj^(t),  82(0)  =0. 


To  obtain  y’(0;X)  we  use  the  two  equations  of  (2.16)  for  t = X and  we 
also  use  the  two  boundary  conditions  (2.13);  this  gives  four  linear 
algebraic  equations  from  which  we  evaluate  the  four  unknowns  y(0;X), 
y'(0;X),  y(X;X),  and  y'(X;X).  Note  again  that  the  initial-value  problem 
for  rj^,  r^,  s^^,  and  s^  are  solved  only  once;  thereafter  for  any  value 
of  \ we  need  only  solve  the  four  algebraic  equations  to  find  y' (0;X) 
and  substitute  into  (2.17)  to  obtain  the  full  solution  y(t;X). 


I 
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Thus  for  both  the  sweep  method  and  Scott's  version  of  Invariant 
embedding,  we  replace  the  original  boundary-value  problem  by  the 
Transformed  Problem:  solve  three  or  four  initial-value  problems  (for 
ordinary  differential  equations)  and  one  small  linear  system  of  algebraic 
equations . 

2.6  Integral  Equations.  By  using  an  appropriate  Green's  function  we  can 
transform  our  original  boundary-value  problem  into  an  integral  equation. 

As  an  Illustration,  consider  the  second-order  problem  (1.2)  subject  to 
linear  boundary  conditions.  Usually  by  subtracting  from  y an  appropriate 
linear  function,  we  can  force  the  boundary  conditions  to  be  homogeneous. 

We  therefore  consider 

(2.19)  y"(t)  - f(t,y(t),y'(t))  for  0<t<l 

subject  to  the  homogeneous  version  of  (1.6),  namely 

(2.20)  £0^(0)  + dQy'(O)  + c^y(l)  + dj^y' (D  “ 0 

If  y*0  is  the  only  solution  to  y"  * 0 subject  to  (2.20)  then  we  can 
find  the  Green's  function  G(t,T)  sc  Laat  y"(t)  »•  g(t)  and  y satisfies 

(2.20)  if  and  only  if 

y(t)  « [ G(t,T)g(T)dT. 

Jo 

Applying  this  fact  to  g(t)  - f (t,y (t) ,y' (t) ) in  (2.19)  yields  the  fact 
that  y solves  (2.19),  (2.20)  if  and  only  if  y solves 

y(t)  - [ G(t,T)f (T,y(T),y' (T))dT 
^0 


(2.21) 


for  0 5 t $ 1 
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In  this  generality  we  have  replaced  (2.19),  (2.20)  by  an  Integro- 
differentlal  equation.  For  the  important  class  of  problems  in  which  f 
is  independent  of  y',  (2.21)  becomes  the  integral  equation 

(2.22)  y(t)  = I G(t,T)f (T,y(T))dT  for  0<t£l. 

Thus  we  replace  the  original  boundary-value  problem  by  a Transformed 
Problem:  solve  for  y in  the  integro-dif ferential  equation  (2.21)  or 
in  the  integral  equation  (2.22) 

3.  DISCRETE  MODELS  OF  TRANSFORMED  PROBLEMS 

We  have  seen  a few  of  the  many  ways  in  which  our  original  boundary- 
value  problem  can  be  transformed  into  an  equivalent  problem;  now  we 
want  to  discuss  methods  for  launching  a frontal  assault  on  the  Transformed 
Problem.  Although  there  are  other  methods  available  [ Aktas-Stetter  (1977)], 
I will  restrict  mvself  to  the  most  successful  methods,  namely  finite 
differences  and  projections,  as  approaches  to  discrete  modeling. 

3.1  Finite  Differences.  The  basic  idea  here  is  to  represent  desired 
functions  g(t)  for  0<t<l  by  the  values  of  g at  some  finite  set  of 
points  05tj^<  t2  < • • • < < 1.  We  approximate  g(t^)  by  some  number 

and  generate  relationships  among  the  values  G^  intended  to  model  what 
the  (transformed)  problem  tells  us  about  g.  Such  modeling  methods  are, 
of  course,  very  well  known:  we  generally  replace  derivatives  by  divided 
differences,  integrals  by  quadrature  sums,  et  cetera.  We  look  briefly 
at  the  models  that  result  when  finite  differences  are  used  with  the 


Tranformed  Problems  of  Section  2. 
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Finite  Differences  for  the  Original  Problem 

Basic  finite  differences  for  the  original  boundary-value  problem 
are,  of  course,  very  well  known.  For  the  first-order  system  (1.1),  for 
example,  we  discretize  by  letting  t = 0 < t2  ^ ^ * 1 and  letting 

the  nxl  vector  approximate  y(t^).  Two  simple,  natural,  and 

effective  schemes  are  to  model  the  differential  equation  (1.1)  either  by 


(3.1) 


^»i+l  ”=1^^^'^1+1  *^i^  “ 2 l^*^i+l’-i+l^  ’ 


for  1 1 1 $ N-1 


or  by 


(3.2) 


for  1 5 1 5 N-1 


and  to  model  the  nonlinear  boundary  conditions  b(y(0),y(l))  *0  in  (1.3)  by 

(3.3)  b(Z^,Zj^)-0  . 

Under  reasonable  hypotheses,  of  course,  this  is  a second-order  method, 
that  is, 

(3.4)  llji  “ y ^*-i^lloo  - for  1-1-N,  c Independent  of  N, 

where  throughout  this  paper  we  use  h to  denote 

(3.5)  h - max{  I t^^j^  - t^  I ; 1^1  ^N-l). 
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HiRher  order  methods  of  course  exist  using  more  complicated  difference 
approximations  for  y'  and/or  more  complicated  suras  of  values  of  f. 

e * 

Such  methods  can  even  be  generated  automatically  as  in  the  HODIE  method 
[Lynch-Rice  (1977)]. 

Since  the  nonlinear  equations  (3.1),  (3.3)  or  (3.2),  (3.3)  are 
often  solved  by  some  linearization  process  and,  since  linear  problems  also 
arise  naturally,  it  Is  Instructive  to  look  briefly  at  the  structure  of, 
say,  (3.2)  and  (3.3),  when  the  problem  is  linear.  We  therefore  consider 
again  (2.4),  namely 

y'(t)  - A(t)y(t)  + g(t),  BQy(O)  + B^y(l)  = e . 

Writing  - t^^^  - t^,  A(t^  + h^/2) , and  - |(t + h^/2) , the 

equations  (3.2),  (3.3)  take  the  form 

+ h^g^  for  1 <1  <N-1 

Ml  + iA  ■ 8- 

In  block  matrix  notation,  this  is 


involvlnji  an  almost  fexceot  for  the  first  block  row)  bi-diagonal  matrix, 
where  i , M^-I-h^^Aj^.  Although  N may  be  quite  large  in 

order  to  obtain  much  accuracy,  the  special  structure  of  the  Nn x Nn  matrix 
in  (3.6)  allows  the  system  to  be  solved  efficiently. 

Finite  Differences  for  Variational  Problems 

We  saw  in  Subsection  2.2  that  second-order  problems  (1.2)  subject  to 
separated  linear  boundary  conditions  (2.2)  and  not  involving  y'  explicity 
in  the  differential  equation  are  often  equivalent  to  minimizing  J [y]  in 
(2.1).  Since  J involves  an  integral  and  a derivative,  the  natural  finite 
difference  approach  is  to  use  a quadrature  sum  and  a divided  difference. 

A simple  example  is  to  let  the‘  n x 1 vector  approximate  y(tj)  and  to 

let  the  vectors  be  chosen  to  minimize 

N-1 

(3.7)  J^lZl-  21  2 f(^i+l-^l>/(‘^i+l-^i>l  +F(t^,Z^)} 

i*l 

subject  to 


More  complicated  differences  or  quadratures  lead  to  more  involved  functions 
[J^  ] to  be  minimized. 

Finite  Differences  for  Shooting  and  its  Variants 
All  of  the  variants  of  shooting  described  in  Subsection  2.3  involve 


solving  some  initial-value  problems  for  ordinary  differential  equations 
as  an  intermediate  step  on  the  way  to  solving  a set  of  algebraic  equations. 
Finite  differences  can  come  Into  play  be  providing  a way  to  solve  these 
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initial-value  problems  approximately.  Any  of  the  high  quality  initial- 
value  codes  such  as  those  for  variable-order  Adams  methods  or  Runge-Kutta- 
Fehlberg  methods  can  be  used  to  solve  the  initial-value  problem.  As  a 
trivial  example  for  multiple  shooting  we  can  replace  y!  = f(t,y  (t))  for 
Ti<t<Ti^l  and  l<l<k-l  by 


(3.8) 


2*^1, j+l"*"  2^i,j+l'^  2=1, 


where  Z approximates  y(t  .)  and 
=1.1  =i»j 


and  we  can  replace  the  initial  condition  with 

(3.9)  . 


Given  the  initial  data  for  multiple  shooting  we  can  use  (3.8)  to 

evaluate  successively  Z^  2’^i  3’  '"’=i  N example, 

in  the  linear  case  in  which  the  original  equation  is 

y'  (t)  - A(t)2(t)  + g(t)  for  0<  t<  1,  BQy(O)  + B^^d)  “ e 
and  in  (2.4),  the  recursion  (3.8)  becomes  merely 


(3.10) 


”l,j^i,j+l"^i,j^i,j  “\,j®l,j 


i 
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Finite  Differences  for  Quasilinearization 

Quasi  linearization  discussed  in  Subsection  2.4  merely  transformed 
the  original  nonlinear  problem  to  a sequence  of  linear  boundary-value 
problems.  We  can  therefore  use  finite  differences,  for  example  as  in 
(3.6),  to  solve  each  of  these  linear  problems.  In  many  cases  using 
finite  differences  on  the  linearized  differential  equation  as  described 
here  is  the  same  as  linearizing  the  nonlinear  equations  that  result  from 
applying  finite  differences  to  the  nonlinear  differential  equation. 

Finite  Differences  for  Continuation  and  Embedding 

Finite  differences  can  be  used  on  the  continuation  problem  just  as 
it  was  on  the  original  (un transformed)  problem.  On  the  other  hand,  for 
the  embedding  methods  we  needed  to  solve  initial-value  problems  like 
(2.14),  (2.15),  or  (2.18).  High  quality  finite  difference  methods  for 


I 


initial-value  problems  can  therefore  be  applied  to  solve  these  just  as 
for  shooting  methods. 
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Finite  Differences  for  Integral  Equations 

We  saw  in  Subsection  2.6  how  we  could  transform,  for  example,  (2.19)- 
(2.10)  with  f independent  of  y'  into  the  integral  equation  (2.22), 
namely 

y(t)  = I G(t,T)f (T,y(T))dT 
•*0 

Letting  t * 0 < t2  < * • * < t^  • 1 and  approximating  y(t^)  by  as  usual, 

we  can  replace  the  Integral  in  the  equation  with  a quadrature  sum.  Using 
the  simple  rectangle  rule,  for  example,  yields 

N-1 

(3.13)  (t^^^-tj)G(t^,tj)f(tj,Z^)  . 

As  usual,  more  complicated  quadrature  formulas  yield  solutions  of  higher 
order  accuracy. 

3.2  Projections.  As  we  saw  in  Subsection  3.1,  various  discrete  problems 
result  from  using  finite  differences  on  the  various  Transformed  Problems. 
Although  we  considered  six  different  types  of  Transformed  Problems,  there 
were  only  four  different  ways  finite  differences  were  used:  i)  directly 
on  boundary-value  problems  (in  the  original  problem,  in  quasi-linearized 
problems,  and  in  continuation  problems);  il)  on  variational  problems; 
ili)  on  initial-value  problems  (in  shootlne  and  its  variants,  and  in 
embedding  methods);  and  Iv)  on  integral  equations.  We  will  see  in  this 
Subsection  that  the  projection  approach  to  discrete  modeling  gives  different 
models  in  the  same  four  ways.  First  we  sketch  the  projection  idea  itself. 
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The  projection  approach  can  be  viewed  as  a way  to  approximate  the 
solution  to  an  equation 

(3.14)  Dx  « F(x) 

where  D is  a linear  onerator  from  some  linear  vector  space  X into 

some  linear  vector  space  Y and  F is  a possibly  nonlinear  operator  from 

X into  Y.  We  choose  X^  to  be  some  finite  dimensional  subspace  of  X 

and  we  let  Y,  be  the  finite-dimensional  subspace  of  Y defined  by 
h 

Yj^  = DX^.  Finally  let  be  some  linear  projection  of  Y into  Y^,  so 

that  is  a linear  operator  for  which  “ Y^i 

The  main  idea  is  to  seek  an  approximate  solution  x,  to  (3.14)  in  X, 

h h 

rather  than  in  X;  if  x,  is  in  X,  however  then  Dx,  is  in  Y,  while 

h h h h 

generally  not  in  Y^^  so  that  (3.14)  cannot  be  solved  in  X^. 

Instead  we  modify  F'(Xj^)  to  P|^F(Xj^)  to  get  it  into  Y^^.  Thus  we 
approximate  x solving  (3.14)  by  x^^  solving 

(3.15)  Dx^  - • 

a finite-dimensional  problem.  General  theorems  are  known  on  the  existence 
of  and  on  the  error  x - x^  ; these  theorems  involve  the  approxima- 

tion properties  of  the  subspaces  X^^  and  Y^^  and  the  precise  nature  of 

the  projection  P,  • 
h 

For  our  problems,  X and  Y are  spaces  of  functions  defined  on 
0^  til,  and  elements  y^^  of  Y^  are  usually  represented  as  linear 
combinations 

yh-ai»i+a2»2+ ••• +3^0-^ 

of  simple  basis  functions  for  Yj^  of  dimension  L,  so  that 

(3.15)  defines  a set  of  equations  in  the  unknows  a. , . . . ,a, . 

^ L« 
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Important  choices  for  the  subspaces  and  are  the  spline  spaces 

S(n,k,r) . Here  0 is  a set  of  break  points  0 < < ’ • • < = 0. 

and  k and  r are  integers  with  kiO,  r--l.  An  element  of  S(n,k,r) 

is  a r times  continuously  differentiable  (often  vector-valued)  function 

on  O^tSl  which  is  defined  by  a (different)  polynomial  of  degree  at 

most  k-1  on  each  Interval  ^ sufficiently  smooth  function 

f on  05t$l  can  usually  be  approximated  by  some  spline  a in  S(n,k,r) 

to  order  k in  the  sense  that  max  |f(t)  -o(t)|=  fl^(|n|  ) where 

Olt  1 1 

|nl=  max  Kj+i  ~ I • Important  fact  is  that  every  element  of 

S(n,k,r)  can  be  written  as  a linear  combination  of  special  basis  splines 
(B-splines)  each  of  which  vanishes  identically  on  all  but  a few  adjacent 
Intervals  ^ * Such  a basis  is  called  a local  B-spline  basis. 

In  practice  the  projection  into  is  usually  defined  by  1) 

collocation  conditions,  ii)  orthogonality  (or  Galerkln)  conditions,  or 
ill)  a mixture  of  collocation  and  orthogonality  conditions.  Collocation 
conditions  are  of  the  form  (P^y)(ri)  =y(ri)  for  certain  values  n ; 
orthogonality  (or  Galerkin)  conditions  are  of  the  form  ^Pj^y-y.'f 0 where 
is  some  function  and  denotes  some  inner  product  (usually 

involving  integrals)  on  our  function  spaces.  We  proceed  now  to  discribe 
briefly  some  projection  methods. 

Projection  for  the  Original  Problem,  for 
Quasilinearization,  and  for  Continuation 

As  we  saw  in  Subsection  3.1,  the  Tranformed  Problems  in  these  three 
cases  all  are  still  explicitly  described  as  two  point  boundary-value 
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problems;  to  be  specific,  consider 


y'(t)  = f(t,y(t))  for  0<t<l,  B„y(0)  + B,y(0)  - 0 , 

where  we  assume  homogeneous  boundary  conditions  for  convenience.  Here 
we  can  think  of  the  space  X as,  say  C^[0,1],  Y as  C[0,l],the  operator 
D as  ^ , and  the  operator  F as  taking  y into  f(t,y(t)).  If  we 
take  to  be  the  spline  subspace  S(Il,k,r)  subject  to  our  homogeneous 

boundary  conditions  above,  then  = DXj^  is  essentially  S (II , k-1 , r-1 ) 
(subject  to  some  boundary  conditions).  Therefore  if  Pj^  denotes  some 
projection  into  this  modified  S(n,k-l,r-l) , our  projection  reduces  to 
finding  a In  5(11, k,r)  satisfying 

If,  for  example,  is  defined  by  some  collocation  conditions  (P^y)(n)* 

y(n).  then  we  impose 

(3.15)  o'  (n)  = I (n.o(n)) ; 

similarly  an  orthogonality  condition  might  take  the  form 

(3.16)  f\o’(t)  - f(t,o(t))]4'(t)dt -0  . 

Jq  - 


Note  that  if  o is  expressed  as  a linear  combination  of  local  basis  elements 
(B-spllnes) , 


then  (3.15)  for  example  becomes 
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(3.17)  a^e'[(n)  + ■•' +aj^e'[(n)  - f(n,a^2^^(n)  + ••• +a^gj^(n))  . 

Since  g^(n)  “ Oj^Cn)  ■ 0 except  for  only  a few  subscripts  1 because  the 
O's  form  a local  basis,  only  a very  few  of  the  coefficients  a^  are 
explicitly  involved  in  each  collocation  equation.  This  Imolles  that 
each  equation  in  the  system  (3.15)  for  a^,...,a^  in  fact  only  Involves 
a few  a^;  this  sparsity  is  what  makes  local  bases  Important.  Approxi- 
mating y by  elements  of  8(11, W,r),  if  is  chosen  carefully  by  carefully 

selecting  collocation  points  n or  orthogonality  functions  ^ , gives 
errors  ^ -g  of  optimal  order  k for  05t5l.  In  addition  there  are 
usually  special  points  in  each  interval  £ t ^ ^i+1  grid  n at 

which  much  higher  accuracy  is  obtained  [Dupont  (1976));  this  superconvergence 
can  be  used  to  generate  global  approximations  of  this  higher  order,  so 
such  results  are  important. 

Projection  for  Shooting  and  Embedding 

The  Important  feature  of  the  Transformed  Problems  produced  by  shooting 
or  embedding  is  that  they  are  initial-value  problems.  The  only  effect  this 
has  on  the  discussion  just  completed  is  to  change  the  side  conditions  on 
the  splines  to  initial  rather  than  boundary  conditions.  In  other  regards, 
collocation  and  Galerkin  projection  methods  look  the  same  here  as  for 
problems  with  explicit  boundary  conditions.  Thus  we  consider  these  no  further. 

Projection  for  Integral  Equations 

If  the  Green's  function  is  used  to  transform  our  original  problem, 
say  of  second-order,  we  end  up  with  an  integral  equation 
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} 

I 

1 


y(t) 


[ G(t, 


T)f (T,y(T))dT. 


In  this  case  we  can  take  X and  Y to  be  C[0,1],  D=I,  and  F as  the 
mapping  from  y to  G(t,T)f (•t,y(T))dT.  Using  splines  a to  approxl- 
mate  y and  collocation,  for  example,  to  define  our  projection  gives  us 
conditions  like 


a(n) 


f G(n, 

Jo 


T)f (T,o(T))dT. 


Note  in  this  case  that  even  if  we  represent  0 in  terms  of  local  basis 
functions  fT,  the  resulting  problem  is  not  sparse  because  of  the  terms 
j G(ri,T) f (t ,aj^C^(T)  + • • • + aj^efj^(T) )dT  which  have  contributions  from  every  a^. 


Projection  for  Variational  Problems 


Although  projection  as  I have  described  it  does  not  strictly  apply 
to  variational  problems,  the  spirit  of  projection  does  apply.  Recall  from 
Subsection  2.2  that  we  are  considering  the  problem  of  minimizing 

Jly]  - |\  |(y'(t))^  + F(t,y(t))}dt 


subject  to 


y(0)  - 0.  y(l)  - 0 . 


One  of  the  ideas  behind  projection  was  to  seek  an  approximate  solution  in 
a finite-dimensional  subspace.  Applying  the  same  idea  here  we  replace  y 
by  +*’*+a  ff  for  some  chosen  functions  ff  , ...,0  and  then  choose 

X i.  1j  Xu 


i 

I 


minimize 
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J(a^, . . . ,3^^)  « J[aj^Oj^+ ••• +a^Oj^]  , 


This  is  commonly  called  the  Rayleigh-Rltz  method.  It  is  in  fact  strongly 

related  to  a projection  method  since  extremizing  J is  strongly  related 

to  making  VJ*0;  this  is  the  same  as  using  projection  with  orthogonality 

fl 

conditions  determined  by  0 and  the  inner  product  (O.  ,g^“  0 (t)g(t)dt. 

A.  SOLUTION  TECHNIQUES  FOR  DISCRETE  MODELS 

This  Section  contains  much  less  detail  than  its  predecessors.  Pri- 
marily I want  to  emphasize  the  fact  that  transforming  a problem  (as  in 
Section  2)  and  then  developing  a finite-dimensional  discrete  model  of  the 
Transformed  Problem  (as  in  Section  3)  do  not  a method  make!  There  still 
remains  the  formidable  task  of  solving  for  the  solution  of  the  discrete 
problem,  and  there  are  usually  very  many  computational  techniques  for  doing 
this;  only  when  the  solution  technique  is  known  is  the  complete  algorithm 
for  the  boundary-value  problem  finally  specified.  Each  of  our  discrete 
models  produced  either  a finite-dimensional  minimization  problem,  or  a finite 
system  of  nonlinear  algebraic  equations,  or  a finite  system  of  linear 
algebraic  equations. 

Quite  a number  of  distinct  methods  are  available  for  minimizing  a 
function  of  several  variables  [Murray  (1972)],  the  problem  which  results 
from  discrete  models  of  the  variational  version  of  boundary-value  problems; 
conjugate  direction  and  variable  metric  methods  are  among  the  most  powerful. 
Because  our  problems  often  have  special  structure,  such  as  having  many 
variables  and  a sparse  Hessian,  techniques  should  be  used  which  are  designed 
to  make  use  of  such  structure. 
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Similarly  many  methods  are  available  [Ortega-Rheinboldt  (1970)1 
for  solving  the  finite  systems  of  nonlinear  algebraic  equations  which 
result  from  the  original  problem,  the  shooting,  and  the  integral  equation 
approaches  to  boundary-value  problems;  among  the  most  popular  are  the 
quasi-Newton  update  methods  which  essentially  use  Newton's  method  only 
with  rough  approximations  to  the  Jacobian  matrix  of  the  nonlinear  system. 

The  systems  arising  from  the  original  problem  and  from  the  Integral  equa- 
tion approaches  usually  have  many  more  unknowns  than  in  the  shooting 
systems.  Systems  for  the  original  problem  are  usually  sparse  while  those 
for  the  integral  equation  are  usually  dense.  Again  special  methods  should 
be  used  depending  on  the  system's  structure. 

Since  systems  of  linear  algebraic  equations  often  arise  from  methods 
to  solve  nonlinear  equations  as  well  as  from  linear  differential  equations 
with  linear  boundary  conditions,  methods  for  solving  linear  algebraic 
systems  are  fundamental  to  solution  techniques  for  our  discrete  models. 

Again,  although  we  often  think  only  of  straightforward  Gauss  elimination, 
there  are  many  techniques  available  for  solving  linear  systems.  A wide 
variety  of  iterative  methods  can  be  used,  especially  for  sparse  problems. 

In  addition,  there  are  many  choices  as  to  how  to  perform  direct  elimination. 
Consider,  for  example,  finite  differences  for  the  original  problem  which 
leads  to  a linear  system  as  in  (3.6)  that  is  almost  block  bi-diagonal. 
Different  procedures  result  from  considering  the  matrix  as  full,  as  banded, 
as  block  banded,  et  cetera;  these  differ  primarily  in  how  much  Information 
is  retained  about  the  location  of  zeroes.  The  work  required  and  the  accuracy 
obtained  will  also  differ  among  the  procedures,  and  it  is  Important  to 
determine  which  procedures  are  "best"  for  solving  a given  discrete  problen. 
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One  of  the  important  practical  areas  of  Investigation  now  is  that 
of  determining  which  solution  technique  should  be  used  on  a given  discrete 
model.  This  is  a vital  area  since  the  efficiency  and  accuracy  of  the 
solution  technique  can  make  or  break  a complete  algorithm;  I have  heard 
of  a high-quality  multiple-shooting  code  which  was  improved  by  a factor 
of  ten  by  Improving  its  nonlinear  equation  solver. 

^ RELATIONSHIPS  AMONG  COMPLETE  ALGORITHMS 

We  have  Indicated  that  a complete  algorithm  is  specified  by  three 
"co-ordinates":  a Transformed  Problem,  a Discrete  Model,  and  a Solution 
Technique.  Unfortunately,  different  sets  of  co-ordinates  can  describe 
identical  (or  very  similar)  complete  algorithms.  In  this  Section  I want 
to  indicate  a few  relationships  among  such  algorithms.  I will  not  compare 
algorithms  in  the  sense  of  saying  which  is  "better",  since  that  question 
can  only  be  addressed  in  terms  of  specific  computer  codes  implementing 
the  algorithms,  specific  sets  of  test-problem  classes,  and  specific 
criteria  for  measuring  "goodness";  there  is  a great  need  for  such  compari- 
sons to  be  performed  with  the  same  care  that  went  into  the  testing  of 
codes  for  initial-value  problems  [Hull  (1975),  Hull  et  al.  (1972), 
Davenport  et  al . (1975)]. 

We  remarked  earlier  on  one  equivalence  among  algorithms.  In  using 
the  Rayleigh-Ritz  discrete  model  for  the  variational  formulation  of  the 
problem,  if  we  set  equal  to  zero  the  gradient  of  the  function  of  finitely 
many  variables  to  be  minimized  we  obtain  a Galerkin  projection  model  for 
the  original  boundary-value  problem.  Similarly  if  we  set  to  zero  the 
gradient  of  the  function  to  be  minimized  in  the  finite-difference  method 


of  the  variational  formulation,  then  we  obtain  a finite-difference  model 


of  the  original  problem. 

It  has  also  been  shown  that  c'ertain  spline  collocation  procedures 
are  identical  to  finite-difference  procedures  when  both  are  applied  to 
the  original  problem.  For  example,  using  the  spline  space  S(n,2,0)  of 
continuous  piecewise  linear  functions  on  the  first-order  problem  (1.1) 
with  collocation  at  the  middle  of  interval  between  break  points  yields 
precisely  the  finite-difference  equations  (3.2)  if  the  points  t^  are 
the  break-points  and  denotes  the  value  of  the  spline  at  t^. 

At  present  it  seems  to  be  generally  believed  that  the  most  competi- 
tive methods  for  boundary-value  problems  are  based  on  projection  for  the 
original  problem,  finite  differences  for  the  original  problem,  finite 
differences  for  shooting  and  its  variants,  and  finite  differences  for 
embedding;  I therefore  want  to  look  briefly  at  the  relationships  among 
these  procedures.  We  have  already  seen  a relationship  between  finite 
differences  and  projection  for  the  original  problem,  so  I want  to  examine 
finite  differences  for  the  original  problem,  for  shooting,  and  for 
embedding.  For  linear  problems  the  relationships  are  very  striking,  since 
the  overall  methods  can  be  identical!  For  nonlinear  problems  the  same 
methods  are  not  necessarily  Identical  but  are  very  similar.  Simply  to 
convey  the  idea  here  we  look  at  linear  first-order  problems. 

Consider  first  simple  shooting  for  the  linear  scalar  problem 

y'  - A(t)y  + g(t),  BQy(O)  + B^y(l)  - e 

where  we  implement  shooting  by  the  simple  finite-difference  method  (3.2); 
letting  approximate  y(t^),  we  get  the  recursion  formulas  appearing 

just  before  (3.6).  Suppose,  to  be  precise,  we  take  six  points 
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► «0<t<t<t<t,<  t,<  t “1.  Then  for  simple  shooting  we  try  to 

0 1 2 3 4 5 6 

find  z so  that  choosing 

Zi  = z.  for  1<1<5,  B^z  + B^Z^  - e 

In  shooting  we  solve  the  above  recursion  for  in  terms  of  2 3od 

then  use  this  plus  the  boundary  condition  B^z+B^Z^^e  to  select  z 
correctly.  If  we  write  the  above  recursion  and  boundary  condition  in 
matrix  notation,  we  obtain 


which  is  precisely  (3.6),  the  equations  we  solved  for  finite  differences 
applied  to  the  original  problem.  Thus  finite  differences  for  simple 


shooting  and  for  the  original  problem  give  the  same  answers.  Moreover, 


we  can  Interpret  the  way  in  which  shooting  solves  for  Z,  in  terms  of 

o 

z " Zj  in  the  language  of  an  elimination  method  for  solving  (3.6)  or  (5.1) 
for  finite  differences  on  the  original  problem.  In  (5.1),  use  the  (2,2) 
element  to  eliminate  the  (3,2)  element-Pj  and  divide  the  second  row 

by  the  (2,2)  element.  Next  use  the  (3,3)  element  to  eliminate  the  (4,3) 
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element  and  divide  row  three  by  the  (3,3)  element.  Keeping  this  up 
we  eventually  transform  the  matrix  in  (5.1)  into 

[ B 0 0 0 0 B 

I ^ i 

I 

X 1 0 0 0 0 

X 0 10  0 0 

(5.2) 

X 0 0 10  0 

I X 0 0 0 1 0 

j^X  0 0 0 0 1 

where  X denotes  the  presence  of  some  nonzero  element.  The  last  row 
of  (5.2)  exp.resses  in  terms  of  as  In  shooting.  If  we  now 

solve  for  Zj^  between  the  first  and  last  rows  in  (5.2)  and  substitute 
the  computed  Zj^  into  the  equations  (5.2),  we  obtain  all  the  values 
Z^,  precisely  as  in  shooting.  Therefore,  not  only  do  we  produce  the 
same  solutions  bv  the  two  procedures,  but  also  finite  differences  for 


simple  shooting  on  linear  problems  is  computationally  step  by  ste 
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By  a slight  generalization  on  the  preceding  argument  we  can  show 
that  finite  differences  for  multiple  shooting  is  identical  step  by 
step  with  an  elimination  method  for  the  finite-difference  method  for 
the  boundary-value  problem.  Likewise,  both  simple  and  multiple  super- 
position are  Identical  with  an  elimination  process.  Somewhat  more 
complex  is  the  fact  that  the  re-orthogonallzation  process  implemented 
with  finite  differences  is  identical  step  by  step  with  a solution 
technique  for  the  finite  difference  equations  (3.6)  for  the  original 
problem;  this  solution  technique  first  eliminates  as  for  multiple 
shooting  and  then  performs  an  orthogonal  simularlty  transformation 
(based  on  the  matrices  in  (2.8))  so  as  to  make  all  of  the  entries 

in  the  matrix  upper-triangular  as  in  (2.8).  Also  it  can  be  shown  that 
finite  differences  on  the  sweep  method  of  embedding  is  identical  with 
standard  Gauss  elimination  in  (3.6). 

Thus  finite  differences  for  the  original  problem,  for  shooting  and 
its  variants  and  for  embedding  only  differ  by  being  different  solution 
techniques  for  the  same  set  of  equations  (3.6).  This  does  not  mean 
that  the  methods  are  not  very  different;  different  solution  techniques 
can  have  drastically  different  results  in  the  presence  of  rounding  error. 
Vfhat  our  statement  does  mean  is  that  it  is  reasonable  to  concentrate  on 
alternatives  to  Gauss  elimination  in  order  to  solve  (3.6).  Again  I 
observe  also  that  finite-difference  methods  for  shooting  and  embedding 
have  the  ability  to  select  grid  points  dynamically,  while  the  finite- 
difference  method  for  the  original  problem  selects  grid  points  in  advance. 

6.  ACCELERATING  CONVERGENCE  AND  ESTIMATING  ERRORS 


For  simplicity  we  begin  the  discussion  of  this  topic  in  a simpler 
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setting  than  differential  equations.  Suppose  that  there  is  some 
marvelous  number  Yq  that  we  wish  to  compute,  and  that  as  some  posi- 
tive parameter  h tends  to  zero  we  are  Instead  able  to  compute  some 
approximation  Y{h)  to  Y^.  The  problem  of  error  estimation  is 
obviously  that  of  estimating  the  size  of  the  error  Y(h)  -Yq;  the  problem 
of  acceleration  is  that  of  generating  another  scheme  Y(h)  for  which 
its  error  Y(h)-YQ  is  "much"  smaller  than  Y(h)-YQ.  The  two  problems 
are  closely  related:  if  e(h)  is  an  accurate  estimate  of  Y(h)-YQ 
then  surely  Y(h)  =Y(h)-e(h)  is  an  accelerated  estimate  of  Y^^YCh)- 
(Y(h)-YQ),  while  if  some  Y(h)  is  "much"  nearer  Y^  than  is  Y(h) 
then  surely  e(h)  =Y(h)-Y(h)  is  very  near  Y(h)-YQ  which  is  the  true 
error.  I will  first  phrase  my  discussion  in  terms  of  accelerating 
convergence. 

Three  convergence-acceleration  devices  which  have  been  used  for 
boundary-value  problems  are  Richardson  extrapolation  [Joyce  (1971)], 
iterated  deferred  correction  [Pereyra  (1967)],  and  Iterated  defect 
correction  [Frank  (1976)].  Richardson  extrapolation  computes  both  Y(h) 
and  Y(rh)  for  some  r<l  and  uses  some  theoretical  information  on  the 
behavior  of  Y(h)-YQ  in  order  to  compute  an  improved  Y(h);  for  differ- 
ential equations  this  involves  computations  on  two  discretizations  in 
order  to  improve  the  accuracy  on  the  less  accurate  of  the  two  solutions 
(the  one  with  the  more  crude  discretization).  Both  deferred  correction 
and  defect  correction  are  much  more  complicated  than  Richardson  extra- 
polation but  have  the  advantage  of  avoiding  computations  on  a refined 
discretization.  Experiments  Indicate  the  deferred  correction  and  defect 
correction  are  more  efficient  than  Richardson  extrapolation,  but  the 
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methods  are  too  complex  to  explain  here. 

There  are  a number  of  other  methods  for  estimating  errors  Y(h)-YQ, 
most  of  which  require  knowledge  of  the  asymptotic  nature  of  the  error. 

For  example,  if  it  is  known  that 

Y(h)-YQ  - T(Y)hP  +0(hP^^) 

for  known  p > 0 and  for  a known  expression  T(Y)  in  Y,  then  one  can 
first  compute  Y(h)  and  then  estimate  the  error  by  Tj^(Y(h))h^  where 
Tj^(y)  denotes  some  approximation  scheme  for  estimating  T(Y),  such  as 
a divided  difference  to  approximate  a derivative. 

Returning  to  the  discussion  of  differential  equations,  what  we  really 
want  is  an  estimate  of  the  error  in  approxlmalng  a solution  y(t)  at 
each  t;  several  of  the  estimation  schemes  suggested  above  have  been  used 
to  do  just  that  for  discrete  models  of  the  original  boundary-value  problem. 
In  shooting  and  embedding  we  are  solving  initial-value  problems,  and  most 
codes  for  such  problems  estimate  the  local  or  one-step  error  rather  than 
the  global  or  total  error  we  desire.  Clearly  the  two  errors  are  related, 
but  it  is  not  fair  to  say,  as  some  have,  that  error  estimation  is  easier 
for  initial-value  problems;  the  fallacy  of  the  statement  comes  from 
measuring  two  different  errors. 

7.  ERROR  CONTROL  AND  PARAMETER  SELECTION 

There  would  be  little  point  to  Section  6 on  estimating  error  if  we 
had  no  use  for  the  estimate.  Vfhat  we  usually  want  is  to  control  the 
error,  in  the  sense  that  ve  want  to  make  the  error  less  than  some  tolerance 
provided  by  a user  of  our  method  at  as  low  a cost  as  possible;  usually 
this  means  controlling  the  error  so  that  it  will  be  only  a little  smaller 
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than  the  tolerance.  In  our  little  mythical  example  in  Section  6 of 
computing  something  called  Y(h)  to  approximate  Y^,  once  h is 
chosen  the  error  is  determined;  thus  to  control  the  error  in  any  way  we 
must  appropriately  select  the  value  of  our  parameter  h.  In  the  real 
methods  we  discussed  in  Sections  2,3,  and  A,  there  are  many  parameters 
at  our  disposal.  The  first,  of  course,  are  the  Transformed  Problem, 
the  Discrete  Model,  and  the  Solution  Technique  we  choose  to  use.  After 
choosing  these,  we  still  face  many  parameters.  For  example:  with 
finite  differences,  how  many  points  t^  to  use  and  where  to  place  them, 
and  what  difference  approximations  to  use;  for  spline  collocation,  how 
many  break  points  to  use  and  where  to  place  them,  where  to  place  collo- 
cation points,  what  degree  and  how  smooth  splines  to  use;  for  shooting, 
how  many  shooting  points  to  use  and  where  to  place  them;  et  cetera. 

Some  methods  appear  to  require  a uniform  spacing  of  some  mesh  (break 
points  or  collocation  points  or  finite  difference  points);  this  can  be 
disastrous  on  problems  whose  solutions  change  slowly  in  some  region 
and  very  rapidly  in  others.  Selecting  the  mesh  in  this  case  is  very 
difficult;  an  Interesting  idea  (Russell-Christiansen  (1978)]  is  to  use 
different  uniform  meshes  on  various  subintervals  of  0<t  <1  and  to 
stop  computing  in  a region  of  the  interval  in  which  the  tolerance  has 
been  met.  While  the  user  of  a computer  code  can  sometimes  wisely  select 
parameters,  in  many  cases  a good  choice  of  parameters  depends  on  proper- 
ties of  the  solution  about  which  the  user  has  no  ideas.  For  this  reason 
an  Important  trend  in  code  development  is  the  inclusion  of  procedures 
which  automatically  select  parameters  in  an  attempt  to  attain  the  desired 
error  efficiently.  This  Is  a vital  research  problem  on  which  some 


progress  is  being  made  but  where  much  remains  to  be  done. 


AO 


8.  CONCLUSIONS 

My  aim  in  this  paper  has  been  to  explain  briefly  what  each  of  a 
variety  of  methods  is,  how  methods  relate  to  one  another,  and  where  are 
the  difficulties  today  that  stimulate  interesting  research  problems. 

To  describe  what  the  methods  are  and  how  they  relate  we  viewed  each 
method  as  a Solution  Technique  for  some  Discrete  Model  of  a Transformed 
Problem;  this  was  done  in  the  setting  of  two  simple  model  problems 
(1.1),  (1.2)  but  extends  readily  to  most  other  types  of  boundary-value 
problems  involving  eigenvalues,  multi-point  boundary  conditions,  et 
cetera.  Those  areas  which  in  my  opinion  deserve  much  more  study  and 
development  Include:  numerical  effects  and  efficiency  of  different 
methods  of  solving  the  linear  algebraic  systems  that  arise;  methods  for 
solving  the  special  nonlinear  algebraic  systems  that  arise;  comparative 
performance  of  codes  Implementing  various  methods  on  carefully  chosen 
classes  of  test  problems;  and  methods  for  estimating  and  controlling 
global  errors  by  automatic  selection  of  parameters  of  the  method. 
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